Best neural network for classification on large texts (entire movie subtitles)

by: mtbrands, 7 years ago


Hello, I'm trying to build a neural network that classifies the genre of a movie from its subtitles, but I am not sure which model to use as a base. Most text models I have found are built around classifying single phrases and words, whereas I want to use an entire movie's worth of subtitles. There's probably no time factor in the classification, so models based on changes over time are out. (Unless we also want to look at the timing of the subtitles, but that would be a later extension.) Ideally I could also extract things like key phrases or words that have a high impact on the classification. I thought of a bag-of-words approach, but I want the model to look at context, not just word counts, so that sentences start to play a role instead of single words.

I've followed the Python plays GTA tutorials for a bit, so CNNs were my first thought; however, these are mainly built for image sequences, if I am right.

If anyone has any experience with something similar or has a suggestion please let me know!






For working with text data, Recurrent Neural Networks are generally used, probably with LSTM or GRU cells. Convolutional Neural Networks are generally used more for image data.
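
To make the "recurrent" part concrete, here is a rough sketch of a plain (Elman-style) RNN forward pass in pure Python. It is not an LSTM or GRU, the weights are random and untrained, and every name and size in it (`make_rnn`, `rnn_classify`, the vocabulary of 10 tokens, 3 genres) is made up for illustration. In practice you would use a library layer and train it.

```python
import math
import random

def make_rnn(vocab_size, hidden_size, num_classes, seed=0):
    """Randomly initialised weights for a tiny Elman-style RNN (untrained)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "embed": mat(vocab_size, hidden_size),   # one vector per token id
        "w_hh": mat(hidden_size, hidden_size),   # hidden -> hidden (the recurrence)
        "w_out": mat(num_classes, hidden_size),  # final hidden state -> class scores
    }

def rnn_classify(params, token_ids):
    """Feed token ids through the RNN one at a time; return class probabilities."""
    hidden_size = len(params["w_hh"])
    h = [0.0] * hidden_size  # hidden state carried from token to token
    for tok in token_ids:
        x = params["embed"][tok]
        # New state depends on the current token AND the previous state.
        h = [
            math.tanh(x[i] + sum(params["w_hh"][i][j] * h[j] for j in range(hidden_size)))
            for i in range(hidden_size)
        ]
    scores = [sum(w * hj for w, hj in zip(row, h)) for row in params["w_out"]]
    # Softmax turns the scores into probabilities that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical sizes: 10 known tokens, 3 genres.
params = make_rnn(vocab_size=10, hidden_size=4, num_classes=3)
probs = rnn_classify(params, [1, 5, 2, 7])
```

Because the hidden state is threaded through the loop, the same tokens in a different order give a different result, which is what a bag of words cannot do.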

-Harrison 7 years ago


Harrison's right: for data that is sequential in nature, such as the words or letters of a text, RNNs are generally used. Over the course of training they can learn and predict patterns of letters and words, and they even seem to pick up on capitalization and punctuation. Basically, anything that can be translated into tokenized time-series data (which written language kind of is) can be run through an RNN.
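
As a rough idea of what "tokenized" data could look like for subtitles, the sketch below turns text into integer ids and cuts it into fixed-length windows, since a whole movie is far too long to feed in as one sequence. The helper names, the window size of 8, and the sample line are all assumptions for illustration, not a recommended setup.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"[a-z0-9']+|[.,!?]", text.lower())

def build_vocab(token_lists, max_size=10000):
    """Map the most frequent tokens to ids; 0 is padding, 1 is 'unknown'."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, _ in counts.most_common(max_size - 2):
        vocab[tok] = len(vocab)
    return vocab

def encode_windows(tokens, vocab, window=8):
    """Convert tokens to ids and split them into equal-length windows."""
    ids = [vocab.get(tok, 1) for tok in tokens]
    windows = []
    for start in range(0, len(ids), window):
        chunk = ids[start:start + window]
        chunk += [0] * (window - len(chunk))  # pad the last window with 0s
        windows.append(chunk)
    return windows

# Made-up subtitle text, just to show the shapes involved.
subtitles = "Get down! They're shooting at us. Get to the car, now!"
tokens = tokenize(subtitles)
vocab = build_vocab([tokens])
windows = encode_windows(tokens, vocab, window=8)
```

Each window is then one sequence for the network, and the per-window predictions for a movie could be averaged or voted on to get a single genre.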

CNNs, on the other hand, are good at noticing local clusters of patterns, which is how image data is structured. That said, imagine using the range of available pixel information (RGB data) to encode other information, such as the current state of a computer (CPU temp, fan speed, RAM, number of applications, etc.), labeling that state, and then training a CNN to recognize it. When your network recognizes a certain "state" of the machine, it could then take some sort of action based on how it classified the data.
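
A minimal sketch of that encoding idea, assuming four made-up metrics with made-up value ranges: each reading is scaled to 0..255 and packed three at a time into RGB pixels. The metric names, ranges, and the `state_to_image` helper are hypothetical, and the CNN itself is left out.

```python
def to_byte(value, low, high):
    """Scale value from [low, high] to an integer in 0..255, clamping outliers."""
    clamped = min(max(value, low), high)
    return round(255 * (clamped - low) / (high - low))

# Hypothetical metrics and their assumed (min, max) ranges.
RANGES = {
    "cpu_temp_c": (20.0, 100.0),
    "fan_rpm": (0.0, 5000.0),
    "ram_used_pct": (0.0, 100.0),
    "num_apps": (0.0, 64.0),
}

def state_to_image(state, width=2):
    """Pack the scaled metrics into rows of (R, G, B) pixels, zero-padded."""
    values = [to_byte(state[name], *RANGES[name]) for name in RANGES]
    while len(values) % 3:          # fill up the last pixel's channels
        values.append(0)
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    while len(pixels) % width:      # fill up the last row
        pixels.append((0, 0, 0))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

# One labeled training example: the image plus the state name you assign to it.
state = {"cpu_temp_c": 85.0, "fan_rpm": 4200.0, "ram_used_pct": 91.0, "num_apps": 30.0}
image = state_to_image(state)
label = "overloaded"  # label chosen by hand for this example
```

With only a handful of metrics the "image" is tiny, so a real version would need many more readings (or a history of them over time) before a CNN has any spatial pattern to find.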

There are also more complex variations built on these two, such as Generative Adversarial Networks (GANs), which generate items from scratch, and autoencoders, which compress information. Autoencoders can be used for translation as well, with the lowest, compressed layer becoming a single representation of both languages. They are also used for style transfer and automated photo cleanup.

This is by no means an exhaustive list either, as research is rushing ahead due to multiple large companies and governments throwing vast sums of money into this area of AI research.

-GaoLaoWai 7 years ago
Last edited 7 years ago
